10 Years of Emergency Response Calls By The Amsterdam Fire Department

By Dumky de Wilde

Introduction

The city of Amsterdam has recently released data from 2005-2015 containing all the responses to emergency calls by the fire brigade in Amsterdam’s greater metropolitan region. If you’ve always wanted to know what a day in the life of a fireman looks like, this is your chance to know what 3650 days in the lives of the 1200 people working for the fire department look like. I decided to add population data per district per year to the data set as well. These numbers, unfortunately, were only available for 2006-2015, and since the data for 2015 was only for the first half of the year, I’ve decided to remove the observations outside of 2006-2014.
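
The join and the year filter themselves are not shown in this write-up. A minimal sketch of how they could look in base R, assuming a lookup table with one row per district and year; the incidents and population_lookup data frames and all their values are made up for illustration:

```r
# Hypothetical toy data; the real incident table has about 90,000 rows.
incidents <- data.frame(
  incident_id = 1:5,
  year        = c(2005, 2006, 2010, 2014, 2015),
  district    = c("Stadsdeel Centrum", "Stadsdeel Centrum", "Stadsdeel Zuidoost",
                  "Stadsdeel Centrum", "Stadsdeel Zuidoost"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: one population figure per district per year (made-up numbers).
population_lookup <- data.frame(
  year       = c(2006, 2010, 2014),
  district   = c("Stadsdeel Centrum", "Stadsdeel Zuidoost", "Stadsdeel Centrum"),
  population = c(80000, 82000, 85000),
  stringsAsFactors = FALSE
)

# Keep only 2006-2014, the range the write-up settles on.
incidents <- incidents[incidents$year >= 2006 & incidents$year <= 2014, ]

# Left join: every incident keeps its row and gains the matching population.
joined <- merge(incidents, population_lookup,
                by = c("year", "district"), all.x = TRUE)
```

Using all.x = TRUE keeps incidents without a matching population row (they get NA), which makes gaps in the lookup table easy to spot.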

Univariate Plots Section

We’ll start off with some of the basic elements of the data set.

str(brwaa)
## 'data.frame':    90305 obs. of  24 variables:
##  $ year                              : int  2006 2006 2006 2006 2006 2006 2006 2006 2006 2006 ...
##  $ district                          : Factor w/ 28 levels "","Stadsdeel Centrum",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ incident_id                       : int  128685 135425 137894 128161 127145 138639 139222 139016 126289 132439 ...
##  $ start_time                        : Date, format: "2006-11-11" "2006-06-13" ...
##  $ incident_type                     : Factor w/ 30 levels "(not specified)",..: 26 2 21 9 18 12 4 29 6 26 ...
##  $ national_incident_classification_1: Factor w/ 11 levels "Alarm","Bezitsaantasting",..: 4 5 4 3 4 4 4 4 3 4 ...
##  $ national_incident_classification_2: Factor w/ 92 levels "",".Te water               dienst",..: 2 25 24 92 2 24 6 24 17 2 ...
##  $ national_incident_classification_3: Factor w/ 283 levels "",".Aanrijd. Snelweg",..: 210 57 156 100 197 104 70 271 158 210 ...
##  $ date                              : Date, format: "2006-11-11" "2006-06-13" ...
##  $ month_nr                          : int  11 6 4 11 12 3 3 3 12 8 ...
##  $ month_name                        : Factor w/ 12 levels "April","Augustus",..: 10 7 1 10 3 8 8 8 3 2 ...
##  $ day_nr                            : int  11 13 13 23 18 26 12 17 31 16 ...
##  $ day_name                          : Factor w/ 7 levels "Dinsdag","Donderdag",..: 7 5 4 4 1 3 3 6 3 2 ...
##  $ week_nr                           : int  45 24 15 47 51 12 10 11 52 33 ...
##  $ quarter                           : int  4 2 2 4 4 1 1 1 4 3 ...
##  $ priority                          : int  1 1 3 1 2 3 1 2 1 1 ...
##  $ hour                              : int  22 20 9 13 8 21 0 11 23 16 ...
##  $ day_night                         : Factor w/ 4 levels "Avond","Middag",..: 1 1 4 2 4 1 3 4 1 2 ...
##  $ object_type                       : Factor w/ 5 levels "","bag","spoor",..: 4 2 2 2 4 2 2 2 2 4 ...
##  $ object_function                   : Factor w/ 27 levels "","Autosnelweg",..: 6 27 3 27 6 27 27 26 18 6 ...
##  $ neighbourhood                     : Factor w/ 168 levels "","Akkerland",..: 96 28 27 40 99 99 28 27 96 60 ...
##  $ city                              : Factor w/ 11 levels "","Aalsmeer",..: 1 4 4 4 1 4 4 4 4 1 ...
##  $ municipality                      : Factor w/ 32 levels "","Aalsmeer",..: 5 5 5 5 5 5 5 5 5 5 ...
##  $ population                        : int  80819 80819 80819 80819 80819 80819 80819 80819 80819 80819 ...

Next, we’ll look at which incidents occur most and least.

#incidents under 100
brwaa %>%
  count(incident_type, sort = T) %>%
  filter(n < 100)
## Source: local data frame [9 x 2]
## 
##                incident_type     n
##                       (fctr) (int)
## 1                         NA    63
## 2         Inflammable gasses    50
## 3   Interregional assistance    49
## 4 Other hazardous substances    25
## 5          Injured personnel    16
## 6        Regional assistance    14
## 7               Reoccupation    11
## 8        Inflammable liquids     7
## 9            Decommissioning     1
#top 5 incidents
brwaa %>%
  count(incident_type, sort = T) %>%
  filter(incident_type %in% incident_type[1:5])
## Source: local data frame [5 x 2]
## 
##                           incident_type     n
##                                  (fctr) (int)
## 1                 OMS / automatic alert 12647
## 2                          Outside fire 12425
## 3                  Elevator confinement  7147
## 4                  Ambulance assistance  7008
## 5 Measurement / disturbance / pollution  6867

Next up are some plots. We’ll start with the univariate plots and look at the counts for some of the different variables.

ggplot(brwaa, aes(reorder(incident_type, table(incident_type)[incident_type]))) +
  geom_bar(fill=spectral_colors[1]) +
  geom_text(stat="count", aes(label=..count..), size=2, colour="white", hjust=1.1, fontface="bold") +
  bg_and_axes +
  scale_x_discrete(expand = c(0,0)) + 
  scale_y_continuous(expand = c(0,0)) + 
  coord_flip() +
  theme(axis.title = element_blank()) +
  ggtitle("Type of incident (count)")

ggplot(brwaa, aes(reorder(priority, table(priority)[priority]))) +
  geom_bar(fill=spectral_colors[9]) +
  geom_text(stat="count", aes(label=..count..), size=2, colour="white", hjust=1.1, fontface="bold") +
  bg_and_axes +
  scale_x_discrete(expand = c(0,0)) + 
  coord_flip() +
  theme(axis.title = element_blank()) +
  ggtitle("Priority of the response call (count)")

ggplot(brwaa, aes(reorder(object_function, table(object_function)[object_function]))) +
  geom_bar(fill=spectral_colors[1]) +
  geom_text(stat="count", aes(label=..count..), size=2, colour="white", hjust=1.1, fontface="bold") +
  bg_and_axes +
  scale_x_discrete(expand = c(0,0)) + 
  scale_y_continuous(expand = c(0,0)) + 
  coord_flip() +
  theme(axis.title = element_blank()) +
  ggtitle("The function of the object (count)")

ggplot(brwaa, aes(reorder(neighbourhood, table(neighbourhood)[neighbourhood]))) +
  geom_bar(fill=spectral_colors[9]) +
  geom_text(stat="count", aes(label=..count..), size=2, colour="white", hjust=1.1, fontface="bold") +
  bg_and_axes +
  scale_x_discrete(expand = c(0,0)) + 
  scale_y_continuous(expand = c(0,0)) + 
  coord_flip() +
  theme(axis.title = element_blank()) +
  ggtitle("The neighbourhood of the incident (count)")

ggplot(brwaa, aes(reorder(district, table(district)[district]))) +
  geom_bar(fill=spectral_colors[1]) +
  geom_text(stat="count", aes(label=..count..), size=2, colour="white", hjust=1.3, fontface="bold") +
  bg_and_axes +
  scale_x_discrete(expand = c(0,0)) + 
  scale_y_continuous(expand = c(0,0)) + 
  coord_flip() +
  theme(axis.title = element_blank()) +
  ggtitle("The district of the incident (count)")

In the final plots for this section, we’ll look at the distribution over time. First off, the frequency over time.

ggplot(brwaa, aes(date, ..count..)) +
  geom_freqpoly(binwidth=90) +
  scale_x_date(breaks = seq(as.Date("2005-01-01"), max(brwaa$date), 365.25)) +
  bg_and_axes +
  theme(axis.text.x = element_text(angle=45)) +
  ggtitle("Frequency polygon of incidents (per 90 days)")

Next up, with regard to time, we’ll look at the different counts for days of the week, months of the year, and the different years.

ggplot(brwaa, aes(day_name)) +
  geom_bar(fill=spectral_colors[1]) +
  geom_text(stat="count", aes(label=..count..), size=2.2, colour="white", hjust=1.3, fontface="bold") +
  bg_and_axes +
  scale_x_discrete(limits = c("Zondag","Zaterdag","Vrijdag",
                              "Donderdag","Woensdag","Dinsdag","Maandag"),
                   labels = days_of_week_EN, expand = c(0,0)) + 
  scale_y_continuous(expand = c(0,0)) + 
  coord_flip() +
  theme(axis.title = element_blank()) +
  ggtitle("Frequency per day of the week")

ggplot(brwaa, aes(reorder(month_name, -month_nr))) +
  geom_bar(fill=spectral_colors[9]) +
  geom_text(stat="count", aes(label=..count..), size=2.2, colour="white", hjust=1.3, fontface="bold") +
  bg_and_axes +
  scale_x_discrete(expand = c(0,1)) + 
  scale_y_continuous(expand = c(0,0)) + 
  coord_flip() +
  theme(axis.title = element_blank()) +
  ggtitle("Frequency per month of the year")

ggplot(brwaa, aes(year)) +
  geom_bar(fill=spectral_colors[1]) +
  geom_text(stat="count", aes(label=..count..), size=2.2, colour="white", hjust=1.3, fontface="bold") +
  bg_and_axes +
  scale_x_continuous(breaks= 2005:2015, expand = c(0,0)) + 
  scale_y_continuous(expand = c(0,0)) + 
  coord_flip() +
  theme(axis.title = element_blank()) +
  ggtitle("Frequency per year")

ggplot(brwaa, aes(year, population, color=district)) +
  stat_summary(fun.y = "mean", geom = "line") +
  bg_and_axes +
  theme(axis.line = element_blank()) +
  ggtitle("Population per district over time")

Univariate Analysis

What is the structure of your dataset?

The dataset is structured around observations (a ‘tidy’ dataset). Each observation represents an incident that was responded to by the fire department. Each observation has 24 variables: a unique identifier and 23 other variables related to the type of incident, the date and time of the incident, the location, and the type of object at which the incident took place.

What is/are the main feature(s) of interest in your dataset?

The main feature of interest is the type of incident, which varies from automatic fire alarms to drowning animals. It’ll be interesting to see if there are any differences, not just among the different types of incident but also in the relation between the type of incident and the place and time of the incident. I won’t look into all the different incidents, but I’ll see if there are some that stand out and are worth looking into.

What other features in the dataset do you think will help support your investigation into your feature(s) of interest?

The date and time as well as the general location (neighbourhood) have been well documented. These properties will be very interesting to look at when assessing the different types of incidents, as well as the number of people living in a particular district for a particular year.

Did you create any new variables from existing variables in the dataset?

I’ve added population data per district and year to the data set, so each observation also carries the population of its district in the year the incident took place. I also cleaned the categorical variables, which for some reason contained extra white space.
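
The cleaning code is not included here. As a rough sketch of one way to do it, assuming the extra white space sits at the start or end of the labels, base R's trimws can be applied before rebuilding the factor; the labels below are hypothetical:

```r
# Hypothetical factor with padded labels, standing in for a raw categorical column.
x <- factor(c("Brand   ", " Alarm", "Brand", "Alarm   "))

# Trim the labels, then rebuild the factor so that duplicates collapse into one level.
cleaned <- factor(trimws(as.character(x)))

levels(cleaned)
```

Without the cleaning step the four padded labels count as four different levels, which would split a single category across several bars in the count plots.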

Of the features you investigated, were there any unusual distributions? Did you perform any operations on the data to tidy, adjust, or change the form of the data? If so, why did you do this?

When looking at the different distributions, a couple of things stand out, even though none of the distributions seems ‘off’. For the type of incident, the first two items, automatic alert (OMS) and outside fire, are the most common with 12647 and 12425 occurrences respectively, far above number three (elevator confinement) with 7147 occurrences. With regard to the neighbourhoods, it’s interesting to see that the Bijlmer (also known as the south-east or ‘Zuid-Oost’ district) stands out so much. It is known as an area with more crime, high unemployment and a lot of high-rise buildings, but this does seem like a lot of incidents. Something worth exploring.

Furthermore, I thought it was interesting to see that January and December seem to stand out among the other months. My guess would be that this has to do mostly with the events and incidents around New Year’s Eve.
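
One way to check that guess would be to compare the number of incidents on 31 December and 1 January with the number on other days. A small sketch with made-up dates; in the real data the dates vector would be the date column of brwaa:

```r
# Hypothetical incident dates, one entry per incident.
dates <- as.Date(c("2010-12-31", "2010-12-31", "2011-01-01", "2011-01-01",
                   "2011-01-01", "2011-03-15", "2011-07-04", "2011-12-31"))

# Number of incidents per calendar day (days without incidents do not appear).
per_day <- table(dates)
counts  <- as.vector(per_day)

# Flag the days that fall on 31 December or 1 January.
is_new_year <- format(as.Date(names(per_day)), "%m-%d") %in% c("12-31", "01-01")

# Average incidents per day around New Year versus all other days.
mean_new_year <- mean(counts[is_new_year])
mean_other    <- mean(counts[!is_new_year])
```

Note that this only averages over days with at least one incident, so on sparse data the figure for ordinary days would be overestimated.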

Bivariate Plots Section

I want to start with an overview of the 20 most common incidents versus the most common neighbourhoods. I’ve sorted the top neighbourhoods and incident types and have shown them in the heatmap below, based on the number of occurrences of the combinations of the two variables.
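
The code further on filters on top_incidents$incident_type[1:20], but the definition of top_incidents is not shown. A sketch of how such a table and the combination counts behind a heatmap could be built, assuming top_incidents is simply a count per incident type sorted from most to least frequent; the toy data frame and its neighbourhood labels are made up:

```r
# Hypothetical toy data standing in for brwaa.
toy <- data.frame(
  incident_type = c("Outside fire", "Outside fire", "Elevator confinement",
                    "Outside fire", "Elevator confinement", "Animal in water"),
  neighbourhood = c("A", "B", "A", "A", "A", "B"),
  stringsAsFactors = FALSE
)

# Count incidents per type and sort descending, most frequent first.
type_counts <- sort(table(toy$incident_type), decreasing = TRUE)
top_incidents <- data.frame(incident_type = names(type_counts),
                            n = as.vector(type_counts),
                            stringsAsFactors = FALSE)

# Keep the top 2 types (the write-up uses the top 20) and count each
# type/neighbourhood combination; these counts would fill the heatmap tiles.
keep <- toy$incident_type %in% top_incidents$incident_type[1:2]
combos <- as.data.frame(table(incident_type = toy$incident_type[keep],
                              neighbourhood = toy$neighbourhood[keep]),
                        responseName = "n", stringsAsFactors = FALSE)
```

Combinations that never occur still get a row with a count of zero, which keeps the heatmap grid complete.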

Next up we’ll look at the most common incidents per day of the week to see if patterns emerge.

day_of_week_incident_heatmap <- brwaa %>%
  filter(incident_type %in% top_incidents$incident_type[1:20]) %>% 
  group_by(day_name, incident_type) %>%
  summarise(n = n())

ggplot(day_of_week_incident_heatmap, aes(day_name, incident_type)) +
  geom_tile(aes(fill=n), colour = "white") +
  geom_text(aes(label=n), size=2, colour="white") +
  bg_and_axes +
  scale_fill_distiller(palette = "Reds", direction = 1) +
   scale_x_discrete(limits = c("Maandag","Dinsdag","Woensdag",
                              "Donderdag","Vrijdag","Zaterdag","Zondag"),
                   labels = days_of_week_EN, expand = c(0,0)) + 
  scale_y_discrete(expand = c(0,0)) +
  theme(legend.position = "none",
        panel.grid = element_blank(),
        axis.ticks = element_blank(),
        axis.title = element_blank(),
        axis.text.y = element_text(size = rel(0.8)),
        axis.text.x = element_text(angle = 45, vjust = 1.15, hjust = 1, size = rel(0.8)),
        axis.line = element_blank()) +
  ggtitle("Number of incidents per type and day of the week")

Here we can clearly see that the automatic alerts seem to relate to workdays, whereas outside fires, interestingly enough, seem to appear in larger numbers on Sundays and Mondays compared to the rest of the week.

Now, one of the things I find interesting, because it’s so recognisable, is elevator confinement. I can imagine the frustration of being locked in an elevator for hours, and the relief when the firemen finally arrive and retrieve you. So, let’s look into that. First off, the number of occurrences per district over time. I have left out the industrial area, ‘Stadsdeel Westpoort’, because it will distort our data when looking at the number of confinements per 1000 people, and it is generally not considered a part of the ‘city’.

elevators <- brwaa %>%
  filter(incident_type == "Elevator confinement") %>%
  filter(district != "" & district != "Stadsdeel Westpoort") %>%
  mutate(month = as.Date(cut(date, breaks="quarter"))) %>%
  group_by(month, district, population) %>%
  summarise(n = n())

ggplot(data = elevators, aes(month, n, fill=district)) +
  geom_area(alpha=0.8) +
  scale_x_date(date_breaks = "1 year", date_labels = "%Y", 
               expand = c(0,0)) +
  theme_minimal() + 
  theme(axis.text.x = element_text(color = "#666666"),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        legend.direction = "horizontal",
        legend.position = "bottom") +
  ggtitle("Number of elevator confinements in Amsterdam") +
  scale_fill_brewer(palette = "RdYlBu")

The interesting thing is not so much that the highest number of elevator confinements is in the Bijlmer (‘Zuid-Oost’ district); after all, it is a region with a large number of high-rise buildings. What’s interesting is how much it has decreased over the years. Let’s compare the average number of occurrences with the number in the Bijlmer. As you can see, there are also a number of peaks. These may be coincidental, but might also be because of a power blackout like the one on May 29th 2006.

‘Diving’ into the numbers.

The other thing I found interesting, and something I hadn’t really realised was part of the work of firemen, is retrieving persons and animals from the water.

water_quarter <- brwaa %>%
  filter(incident_type == "Animal in water" | incident_type == "Person in water") %>%
  filter(district %in% top_district$district[1:10] & district != "") %>%
  mutate(quarter = as.Date(cut(date, breaks="quarter"))) %>%
  group_by(quarter, incident_type) %>%
  summarise(n = n())

ggplot(data = water_quarter, aes(quarter, n, colour=incident_type)) +
  geom_line(aes(color=incident_type), alpha=0.9) +
  geom_point() +
  scale_x_date(date_breaks = "1 year", date_labels = "%Y", 
               expand = c(0,0)) +
  bg_and_axes + 
  theme(axis.text.x = element_text(color = "#666666"),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.line.y = element_blank()) +
  ggtitle("Persons and animals in the water") +
  scale_colour_brewer(palette = "Set1", direction=-1)

It looks like there is a definite seasonal peak for the animals: lots of summertime incidents, not so many during wintertime. That peak doesn’t seem to be present for people, but then again, there’s not much of a beach in Amsterdam. Let’s look at the monthly averages to see if we can distinguish the seasonal peak.

water_monthly <- brwaa %>%
  filter(incident_type == "Animal in water" | incident_type == "Person in water") %>%
  filter(district %in% top_district$district[1:10] & district != "") %>%
  group_by(month_nr, month_name, year, incident_type) %>%
  summarise(n = n()) %>% 
  group_by(month_nr, month_name, incident_type) %>%
  summarise(avg = mean(n), std = sd(n), n = n())

ggplot(water_monthly, aes(month_nr, avg, colour=incident_type)) +
  geom_ribbon(aes(ymin=avg-std/2, ymax=avg+std/2, linetype=NA), alpha=0.05) +
  geom_line() +
  geom_point() +
  annotate("text", x=1, y=11, label="Standard deviation", size = 1.9, angle = 335, vjust = 1.3, hjust = 0,
           color = "#aaaaaa") +  
  bg_and_axes +
  scale_y_continuous(limits = c(0,16)) +
  scale_x_continuous(breaks=1:12, labels = unique(water_monthly$month_name),
                     expand = c(0,0.2)) +
  theme(axis.text.x = element_text(color = "#666666", angle = 45, 
                                   vjust = 1, hjust = 0.9),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        axis.line.y = element_blank()) +
  ggtitle("Persons and animals in the water (average monthly)") +
  scale_colour_brewer(palette = "Set1", direction=-1)

Bivariate Analysis

Talk about some of the relationships you observed in this part of the investigation. How did the feature(s) of interest vary with other features in the dataset?

We’ve seen a couple of interesting things. First of all, we noticed that a certain district, the Bijlmer, stood out from the rest with regard to elevator confinement, when looking at a heatmap of the district/incident type counts. On further inspection, we noticed that the relation between the Bijlmer (‘Zuid-Oost’) district and the other districts on this particular feature was an odd one. At the beginning of 2006 the number of elevator confinements was way above the average, but it has slowly declined over the years and is now on par with the overall average. We see the same trend when we look at the number of incidents corrected for the number of people living in the district. There too, the number of incidents per 1000 people is declining for the Zuid-Oost district.

Did you observe any interesting relationships between the other features (not the main feature(s) of interest)?

Another feature I looked into was that of retrieving people and animals from the water. There is a very strong seasonal trend for animals, with peaks in the summertime, but this trend was not as present for people. This was something I hadn’t considered before, although it is not hard to imagine why this happens.

What was the strongest relationship you found?

The strongest relationship seems to be the decline in elevator confinements over the years in the ‘Zuid-Oost’ district, especially compared to the other districts in Amsterdam and also when taking into account the number of people living there.

Multivariate Plots Section

One thing we haven’t really looked into so far is the population data in the dataset. In the next plot I want to use the population data to calculate the frequency of incidents relative to the number of people living in a certain district.

#We'll leave out the Westpoort district because it's an industrial area with a very small population (<500)
incidents_per_person <- brwaa %>%
  filter(district != "Stadsdeel Westpoort") %>% 
  group_by(year, district, population) %>%
  summarise(n = n())

ggplot(incidents_per_person, aes(year, n/population*1000, color=district)) +
  geom_point() +
  geom_line() +
  scale_y_continuous(limits = c(0,21)) +
  bg_and_axes +
  ggtitle("Number of incidents per 1000 people per year")
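
The plot above divides the yearly count by the population and multiplies by 1000. A tiny worked example of that calculation, with made-up counts and populations for two hypothetical districts:

```r
# Hypothetical yearly counts and populations for two districts (made-up numbers).
rates <- data.frame(
  district   = c("District A", "District B"),
  n          = c(1200, 300),
  population = c(80000, 15000)
)

# Incidents per 1000 residents: raw count divided by population, times 1000.
rates$per_1000 <- rates$n / rates$population * 1000
```

In this toy case District A has four times as many incidents as District B, yet District B comes out higher per 1000 residents (20 versus 15), which is the kind of difference the raw counts hide.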

Now let’s go back to our elevator confinement case. We’ll have a look again at the ‘Zuid-Oost’ district versus the average number of confinements. And then we’ll correct for population differences.

ggplot(data = elevators, aes(month, n)) +
  stat_summary(geom="line", linetype=3, fun.y = "mean") +
  annotate("text", x=as.Date("2006-06-01"), y=22, 
                             label="Average number of \nelevator confinements", size = 2) +
  geom_line(data=elevators[which(elevators$district == "Stadsdeel Zuidoost"),], 
             aes(y=n, colour=district)) +
  scale_x_date(date_breaks = "1 year", date_labels = "%Y", 
               expand = c(0,0)) +
  bg_and_axes + 
  theme(axis.text.x = element_text(color = "#666666"),
        axis.title.x = element_blank(),
        axis.title.y = element_blank()) +
  ggtitle("Elevator confinements in Zuid-Oost") +
  scale_colour_brewer(palette = "Set1")

Let’s look at the same data, but corrected for the number of people per district.

ggplot(data = elevators, aes(month, n/population*1000)) +
  stat_summary(geom="line", linetype=3, fun.y = "mean") +
  annotate("text", x=as.Date("2006-06-01"), y=0.5, 
                             label="Average number of \nelevator confinements", size = 2) +
  geom_line(data=elevators[which(elevators$district == "Stadsdeel Zuidoost"),], 
             aes(y=n/population*1000, colour=district)) +
  scale_x_date(date_breaks = "1 year", date_labels = "%Y", 
               expand = c(0,0)) +
  bg_and_axes + 
  theme(axis.text.x = element_text(color = "#666666"),
        axis.title.x = element_blank(),
        axis.title.y = element_blank()) +
  ggtitle("Elevator confinements in Zuid-Oost per 1000 people") +
  scale_colour_brewer(palette = "Set1")

Now finally we turn to a grand overview of the data to see if we can spot any patterns. We’ll show a heatmap of the top 20 incidents per hour of the day/day of the week.

day_of_week_hour_heatmap <- brwaa %>%
  filter(incident_type %in% top_incidents$incident_type[1:20]) %>% 
  group_by(day_name, hour, incident_type) %>%
  summarise(n = n())

ggplot(day_of_week_hour_heatmap, aes(factor(hour), day_name)) +
  geom_tile(aes(fill=n^(1/2)), colour = "white") +
  #geom_text(aes(label=n), size=2, colour="white") +
  bg_and_axes + coord_equal() +
  scale_fill_distiller(palette = "Reds", direction = 1) +
   scale_y_discrete(limits = c("Zondag","Zaterdag","Vrijdag",
                              "Donderdag","Woensdag","Dinsdag","Maandag"),
                   labels = days_of_week_EN, expand = c(0,0)) + 
  scale_x_discrete(expand = c(0,0)) +
  theme(legend.position = "none",
        axis.ticks = element_blank(),
        axis.title = element_blank(),
        axis.line = element_blank(),
        panel.grid = element_blank()) +
  ggtitle("Number of incidents per day of the week / hour of the day") +
  facet_wrap(~incident_type, ncol = 2)
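
The tiles above are coloured by the square root of the count rather than the count itself. The reason is not spelled out, but a plausible one is that a few very busy hours would otherwise wash out everything else. A small sketch with made-up counts showing the effect on where each value lands on the colour scale:

```r
# Made-up tile counts with one dominant value.
n <- c(1, 4, 25, 400)

# Position of each count on a linear colour scale (0 = lightest, 1 = darkest).
linear <- (n - min(n)) / (max(n) - min(n))

# The same positions after a square-root transform of the counts.
root <- (sqrt(n) - min(sqrt(n))) / (max(sqrt(n)) - min(sqrt(n)))

round(linear, 2)
round(root, 2)
```

On the linear scale the middle values sit almost at the light end; after the transform they are spread further apart, so smaller differences stay visible.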

Multivariate Analysis

Talk about some of the relationships you observed in this part of the investigation. Were there features that strengthened each other in terms of looking at your feature(s) of interest?

When looking deeper into the data on elevator confinement we could see that the change in the Zuid-Oost district was really different from all of the other districts and was also not related to a change in population.

Also, when we compared different types of incidents by time of day and day of the week, patterns started to emerge. For example, automatic alerts occurred mostly during working hours, whereas outside fires occurred mostly outside of working hours.
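
To put a number on that observation, one could compute the share of each incident type that falls within working hours. The sketch below uses made-up incidents and assumes working hours mean Monday to Friday from 09:00 up to 17:00; that definition is an assumption, not something taken from the data:

```r
# Hypothetical incidents: hour of day (0-23) and Dutch day name, as in brwaa.
toy <- data.frame(
  incident_type = c(rep("OMS / automatic alert", 4), rep("Outside fire", 4)),
  hour          = c(10, 14, 11, 22, 20, 23, 2, 13),
  day_name      = c("Maandag", "Dinsdag", "Vrijdag", "Zaterdag",
                    "Zondag", "Maandag", "Zaterdag", "Woensdag"),
  stringsAsFactors = FALSE
)

# Assumed definition of working hours: Monday to Friday, 09:00 up to 17:00.
workdays <- c("Maandag", "Dinsdag", "Woensdag", "Donderdag", "Vrijdag")
toy$working_hours <- toy$day_name %in% workdays & toy$hour >= 9 & toy$hour < 17

# Share of each incident type that falls within working hours.
share <- tapply(toy$working_hours, toy$incident_type, mean)
```

Taking the mean of a logical column gives the proportion of TRUE values, so share holds one fraction between 0 and 1 per incident type.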


Final Plots and Summary

Plot One

day_of_week_hour_heatmap <- brwaa %>%
  filter(incident_type %in% top_incidents$incident_type[1:20]) %>% 
  group_by(day_name, hour, incident_type) %>%
  summarise(n = n())

ggplot(day_of_week_hour_heatmap, aes(factor(hour), day_name)) +
  geom_tile(aes(fill=n^(1/2)), colour = "white") +
  #geom_text(aes(label=n), size=2, colour="white") +
  bg_and_axes + coord_equal() +
  scale_fill_distiller(palette = "Reds", direction = 1) +
   scale_y_discrete(limits = c("Zondag","Zaterdag","Vrijdag",
                              "Donderdag","Woensdag","Dinsdag","Maandag"),
                   labels = days_of_week_EN, expand = c(0,0)) + 
  scale_x_discrete(expand = c(0,0)) +
  theme(legend.position = "none",
        axis.ticks = element_blank(),
        axis.title = element_blank(),
        axis.line = element_blank(),
        panel.grid = element_blank()) +
  ggtitle("Number of incidents per day of the week / hour of the day") +
  facet_wrap(~incident_type, ncol = 2)

Description One

This first plot shows the different types of incidents that appear in the data per day/hour. I think it gives a great overview of the dataset and a way to spot patterns or interesting things.

Plot Two

Description Two

I chose the second plot because it shows one of the most interesting findings in this data set. It is a comparison of the average number of elevator confinements to that of a particular district, Zuid-Oost. This comparison shows how much the Zuid-Oost district differs from the other districts, even when corrected for population.

Plot Three

Description Three

This plot is another great example of the fascinating things you can find when looking at the fire department data. It shows how something you may not have expected the fire department to do, retrieving animals from the water, follows a different pattern from retrieving people from the water. Since the same pattern has occurred for several years, it looks like this is not just the effect of one hot summer, but a genuine seasonal pattern.
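
A simple way to check that the pattern repeats would be to count summer incidents against the rest of the year separately for each year. A sketch with made-up incidents, assuming summer means June to August:

```r
# Hypothetical "Animal in water" incidents: year and month number.
animals <- data.frame(
  year     = c(2010, 2010, 2010, 2011, 2011, 2011, 2011),
  month_nr = c(6, 7, 1, 6, 7, 8, 12)
)

# Label June-August as summer (an assumption); everything else as rest of year.
animals$season <- ifelse(animals$month_nr %in% 6:8, "summer", "rest")

# Cross-tabulate: one row per year, one column per season.
by_year <- table(animals$year, animals$season)

# TRUE for every year in which summer incidents outnumber the rest of the year.
summer_dominates <- by_year[, "summer"] > by_year[, "rest"]
```

Summer covers only three of the twelve months here, so a year in which it still outnumbers the other nine combined points to a strong seasonal effect.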


Reflection

The subject of this dataset, the work of the fire department, is something I find very interesting to visualise. It enables us to think about the larger story when we hear those sirens drive by. I was intrigued by the depth of the dataset, which meant that it took a while to truly understand what was going on. The first few plots were a way of finding out if there were any interesting angles to explore. What really helped me, though, was learning how to properly use a heatmap in R. The ease of visualising patterns for, in this case, the different types of incidents in different parts of the city has given me some great insights into the dataset and has helped me dive further into interesting details like the elevator confinements or retrieving people and animals from the water.

Something that I found hard was working with so many categorical variables and relatively few continuous variables. This made it harder to find a good way to plot, and that’s where the heatmap came in handy. Adding the population data was a way to get a continuous variable that could potentially also serve as a basis for a model, especially when more data like crime rates or unemployment is added. Adding that data in the right way took a few hours of diving further into R, though it taught me some valuable skills. Although the data set itself was pretty clean, it still took me quite some time to get everything right; just figuring out that there were large amounts of whitespace in some places took hours.

All in all, I think this data set and analysis have provided a great way to learn some new visualisation skills and to think about what is potentially interesting for an audience. For further exploration I’d look into adding some extra data like crime rates or unemployment, or maybe look for differences between years.